Duration normalization and hypothesis combination for improved spontaneous speech recognition

Authors

  • Jon P. Nedel
  • Richard M. Stern
Abstract

When phone segmentations are known a priori, normalizing the duration of each phone has been shown to be effective in overcoming weaknesses in the duration modeling of Hidden Markov Models (HMMs). While we have observed potential relative reductions in word error rate (WER) of up to 34.6% with oracle segmentation information, it has been difficult to achieve significant improvement in WER with segmentation boundaries that are estimated blindly. In this paper, we present simple variants of our duration normalization algorithm, which use blindly estimated segmentation boundaries to produce different recognition hypotheses for a given utterance. These hypotheses can then be combined for significant improvements in WER. With oracle segmentations, WER reductions of up to 38.5% are possible. With automatically derived segmentations, this approach has achieved WER reductions of 3.9% for the Broadcast News corpus, 6.2% for the spontaneous register of the MULT_REG corpus, and 7.7% for a spontaneous corpus of connected Spanish digits collected by Telefónica Investigación y Desarrollo.
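
To make the idea of duration normalization concrete, the following is a minimal sketch, not the authors' algorithm. It assumes that feature frames are stored as a NumPy array and that phone boundaries are given as frame index pairs. Each phone segment is mapped to a fixed number of frames by nearest-frame resampling, which drops frames from long phones and repeats frames in short ones. The function name `normalize_phone_durations`, the parameter `target_len`, and the frame-repetition strategy for short phones are all assumptions made for illustration.

```python
import numpy as np


def normalize_phone_durations(features, boundaries, target_len=8):
    """Map every phone segment to exactly `target_len` frames.

    features   : array of shape (num_frames, dim), one feature vector per frame
    boundaries : list of (start, end) frame indices, end exclusive
    target_len : hypothetical normalized duration in frames (an assumption)
    """
    features = np.asarray(features, dtype=float)
    normalized = []
    for start, end in boundaries:
        segment = features[start:end]
        if len(segment) == 0:
            continue  # skip empty segments rather than guess their content
        # Evenly spaced positions across the segment, rounded to the nearest
        # frame: long phones lose frames, short phones have frames repeated.
        idx = np.linspace(0, len(segment) - 1, target_len).round().astype(int)
        normalized.append(segment[idx])
    if not normalized:
        return features[:0]
    return np.concatenate(normalized, axis=0)


# Toy input: 10 frames of 2-dimensional features and two phone segments.
toy_features = np.arange(20).reshape(10, 2)
toy_boundaries = [(0, 3), (3, 10)]
toy_normalized = normalize_phone_durations(toy_features, toy_boundaries, target_len=4)
```

Running the same routine with different boundary estimates or different values of `target_len` would give different normalized feature streams, and so different recognition hypotheses for the same utterance. How those hypotheses are combined is not specified in the abstract and is not sketched here.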

Similar resources

Duration normalization for improved recognition of spontaneous and read speech via missing feature methods

Hidden Markov Models (HMMs) are known to model the duration of sound units poorly. In this paper we present a technique to normalize the duration of each phone to overcome this weakness, with the conjecture that speech with normalized phone durations may be better modeled and discriminated using standard HMM acoustic models. Duration normalization is accomplished by dropping frames if a phone i...

Improving the performance of MFCC for Persian robust speech recognition

Mel-frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing step that is applied to t...

Hierarchical duration modelling for speech recognition using the ANGIE framework

We describe a novel hierarchical duration model for speech recognition. The modelling scheme is based on the ANGIE framework, a flexible unified sublexical representation for speech applications. Our duration model captures contextual factors that influence duration of sublexical units at multiple linguistic levels simultaneously, using both relative and absolute duration information. The modelling...

Word Level Timing in Spontaneous Japanese Speech

This study provides evidence against the hypothesis that Japanese has word level mora-timing. Unlike previous studies which used careful speech, this paper evaluates timing in a corpus of spontaneous Japanese speech from 11 speakers. Correlations between word duration and number of moras in the word are shown to be much lower than in careful speech studies. Furthermore, if there were durational...

Acoustic analysis and automatic recognition of spontaneous children's speech

This paper presents analyses, and recognition experiments, on spontaneous American English speech collected from children aged from 8 to 13 years. These analyses focused on variations in phone duration and on the scattering of phones in the acoustic space and were aimed at achieving a better understanding of spectral and temporal changes occurring in spontaneous speech produced by children of v...

Publication date: 2003